Abstract



We give an elementary proof of a generalization of Bourgain and Tzafriri's Restricted 
Invertibility Theorem, which says roughly that any matrix with columns of unit length and 
bounded operator norm has a large coordinate subspace on which it is well-invertible. Our 
proof gives the tightest known form of this result, is constructive, and provides a deterministic 
polynomial time algorithm for finding the desired subspace. 

1 Introduction 



In this note we study the following well-known theorem of Bourgain and Tzafriri. 

Theorem 1 (Restricted Invertibility [3]). There are universal constants $c, d > 0$ such that
whenever $L$ is a linear operator on $\ell_2^n$ with $\|Le_i\| = 1$ for the canonical basis vectors $\{e_i\}_{i \le n}$,
one can find a subset $\sigma \subset [n]$ of cardinality

\[ |\sigma| \ge \frac{cn}{\|L\|_2^2} \]

for which

\[ \Big\| \sum_{i \in \sigma} a_i L e_i \Big\|^2 \ge d \sum_{i \in \sigma} |a_i|^2 \tag{1} \]

for all scalars $\{a_i\}_{i \in \sigma}$.



This theorem has had significant applications in the local theory of Banach spaces and in the
study of convex bodies in high dimensions. It is also considered a step towards the resolution of
the famous Kadison-Singer conjecture, which asks if there exists a partition of $[n]$ into a constant
number of subsets $\sigma_1, \ldots, \sigma_k$ for which (1) holds. Recently, the theorem has attracted attention
in numerical analysis due to its connection with the column subset selection problem, which
seeks to select a 'representative' subset of columns from a given matrix. In particular, Tropp [6]
has developed a randomized polynomial time algorithm which finds the subset $\sigma$ efficiently.

Bourgain and Tzafriri's proof of Theorem 1 uses probabilistic and functional analytic
techniques and is non-constructive. In the original paper the theorem was shown to hold for
$c = d \sim \ldots$. Later on [3], the same authors proved it for $c = c(\epsilon) = c'\epsilon^2$ and $d = (1+\epsilon)^{-1}$ for
every $0 < \epsilon < 1$, where $c'$ is a universal (tiny) constant. They were interested in the case when
$\epsilon$ is small; the quadratic dependence of $c(\epsilon)$ on $\epsilon$ was shown to be necessary in [2]. In another
regime, modern methods can be used to obtain the constants $c = 1/128$ and $d = 1/(8\sqrt{2\pi})$ [5, 6].

In this note, we present a short proof that uses only basic linear algebra, achieves much better
constants, and contains a deterministic $O(n^4)$ time algorithm for finding the set $\sigma$. Our method
of proof involves building $\sigma$ iteratively using a 'barrier' potential function. Such a method was
used by Batson and the authors in [1] to construct linear size spectral sparsifiers of graphs.

Specifically, we prove the following generalization of Theorem 1, in which $\|\cdot\|_2$ refers to the
spectral (i.e., operator) norm and $\|\cdot\|_F$ refers to the Frobenius (i.e., Hilbert-Schmidt) norm.



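Theorem 2. Suppose $v_1, \ldots, v_m \in \mathbb{R}^n$, $\sum_i v_i v_i^T = I$, and $0 < \epsilon < 1$. Let $L : \ell_2^n \to \ell_2^n$ be a linear
operator. Then there is a subset $\sigma \subset [m]$ of size

\[ |\sigma| \ge \left\lfloor \epsilon^2 \frac{\|L\|_F^2}{\|L\|_2^2} \right\rfloor \]

for which $\{Lv_i\}_{i \in \sigma}$ is linearly independent and

\[ \lambda_{\min}\Big( \sum_{i \in \sigma} (Lv_i)(Lv_i)^T \Big) > \frac{(1-\epsilon)^2 \|L\|_F^2}{m}, \]

where $\lambda_{\min}$ is computed on $\mathrm{span}\{Lv_i\}_{i \in \sigma}$.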

This form of generalization was introduced by Vershynin [7] in his study of contact points
of convex bodies via John's decompositions of the identity. It says that given any such
decomposition and any $L : \ell_2^n \to \ell_2^n$, there is a part of the decomposition on which $L$ is
well-invertible and whose size is proportional to the stable rank $\|L\|_F^2 / \|L\|_2^2$.
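As a small numerical illustration, the stable rank can be computed directly from the two norms. The sketch below assumes NumPy; the helper name `stable_rank` and the example matrix are illustrative choices, not part of the original note.

```python
import numpy as np

def stable_rank(L):
    """Return ||L||_F^2 / ||L||_2^2 (hypothetical helper name)."""
    fro_sq = np.sum(np.asarray(L, dtype=float) ** 2)  # squared Frobenius norm
    op_sq = np.linalg.norm(L, 2) ** 2                 # squared spectral norm
    return fro_sq / op_sq

# Illustrative matrix with singular values 1, 1, 0.5, 0.
L_example = np.diag([1.0, 1.0, 0.5, 0.0])
print(stable_rank(L_example))  # Frobenius^2 = 2.25, spectral^2 = 1
```

The stable rank never exceeds the rank, and it is insensitive to small singular values, which is why it is the natural size parameter in Theorem 2.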

The original form of Bourgain and Tzafriri's theorem follows quickly from Theorem 2 with
constants

\[ c(\epsilon) = \epsilon^2 \quad \text{and} \quad d(\epsilon) = (1-\epsilon)^2 \]

by taking $\{v_i\}$ to be the standard basis $\{e_i\}_{i \le n}$ and assuming $\|Le_i\| = 1$. This dominates previous
bounds in all regimes, for $\epsilon$ small and large.
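The specialization can be spelled out as follows. This is a worked restatement that uses only the hypotheses of Theorem 1 and the conclusion of Theorem 2 (read with the size bound rounded down to an integer). With $v_i = e_i$ we have $m = n$ and $\|L\|_F^2 = \sum_i \|Le_i\|^2 = n$, so:

```latex
\begin{align*}
|\sigma| &\ge \left\lfloor \epsilon^2 \frac{\|L\|_F^2}{\|L\|_2^2} \right\rfloor
          = \left\lfloor \frac{\epsilon^2 n}{\|L\|_2^2} \right\rfloor, \\
\lambda_{\min}\Big( \sum_{i \in \sigma} (Le_i)(Le_i)^T \Big)
         &> \frac{(1-\epsilon)^2 \|L\|_F^2}{m} = (1-\epsilon)^2 .
\end{align*}
```

Since $\{Le_i\}_{i \in \sigma}$ is linearly independent, the nonzero eigenvalues of $\sum_{i \in \sigma} (Le_i)(Le_i)^T$ coincide with the eigenvalues of the Gram matrix of these vectors, so $\|\sum_{i \in \sigma} a_i Le_i\|^2 \ge (1-\epsilon)^2 \sum_{i \in \sigma} |a_i|^2$ for all scalars, which is the restricted invertibility estimate with $d = (1-\epsilon)^2$.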

2 Proof of the Theorem 

We will build the matrix $A = \sum_{i \in \sigma} (Lv_i)(Lv_i)^T$ by an iterative process that adds one vector to
$\sigma$ in each step. The process will be guided by the potential function$^1$

\[ \Phi_b(A) = \sum_i (Lv_i)^T (A - bI)^{-1} (Lv_i) = \mathrm{Tr}\left[ L^T (A - bI)^{-1} L \right] \quad \text{since } \sum_i v_i v_i^T = I, \]

where the barrier $b$ is a real number that varies from step to step.
Initially $A = 0$, the barrier is at $b = b_0 > 0$, and the potential is

\[ \Phi_{b_0}(0) = \mathrm{Tr}\left[ L^T (0 - b_0 I)^{-1} L \right] = -\mathrm{Tr}\left[ L^T L \right] / b_0 = -\|L\|_F^2 / b_0 . \]

1 This potential function was inspired by the Stieltjes transform, which appears in the analysis of the eigenvalues
of random matrices. However, we are unaware of a formal connection. This potential function is also related to,
but is not identical to, the logarithmic barrier function used in interior point algorithms for linear programming.
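The two expressions for the potential, and its initial value, can be checked numerically. The following is a minimal sketch assuming NumPy; the function name `potential`, the three-vector decomposition of the identity in the plane, and the operator `L` are illustrative assumptions rather than anything specified in the note.

```python
import numpy as np

def potential(L, A, b):
    """Barrier potential Tr[L^T (A - b I)^{-1} L] (hypothetical helper name)."""
    n = A.shape[0]
    return np.trace(L.T @ np.linalg.solve(A - b * np.eye(n), L))

# Assumed example: three directions at 120 degrees, scaled by sqrt(2/3),
# satisfy sum_i v_i v_i^T = I in R^2.
angles = 2 * np.pi * np.arange(3) / 3
V = np.sqrt(2 / 3) * np.vstack([np.cos(angles), np.sin(angles)])  # columns v_i
L = np.array([[2.0, 1.0],
              [0.0, 1.0]])            # arbitrary illustrative operator

A = np.zeros((2, 2))                  # no vectors chosen yet
b0 = 0.5
W = L @ V                             # columns are L v_i
M = np.linalg.inv(A - b0 * np.eye(2))

sum_form = sum(W[:, i] @ M @ W[:, i] for i in range(3))  # sum over the L v_i
trace_form = potential(L, A, b0)                         # trace expression
initial_value = -np.sum(L ** 2) / b0                     # -||L||_F^2 / b_0
```

Here all three quantities agree, because the $v_i$ sum to the identity and $A = 0$.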

Each step of the process involves adding some rank-one matrix $ww^T$ to $A$, where $w \in \{Lv_i\}_{i \le m}$
(if $w = Lv_j$ then this corresponds to adding $j$ to $\sigma$), and shifting the barrier towards zero by
some fixed amount $\delta > 0$, without increasing the potential. Specifically, we want

\[ \Phi_{b-\delta}(A + ww^T) \le \Phi_b(A). \]

We will maintain the invariant that after k vectors have been added, A has exactly k nonzero 
eigenvalues, all greater than b. Keeping the potential small (in fact, sufficiently negative) will 
ensure that there is a suitable vector to add at each step. 

In any step of the process, we are only interested in vectors $w$ which add a new nonzero
eigenvalue that is greater than $b' = b - \delta$. These are identified in the following lemma, where
the notation $A \succeq B$ means that $A - B$ is positive semidefinite.

Lemma 3. Suppose $A \succeq 0$ has $k$ nonzero eigenvalues, all greater than $b' > 0$. If

\[ w^T (A - b'I)^{-1} w < -1 \tag{2} \]

then $A + ww^T$ has $k+1$ nonzero eigenvalues, all greater than $b'$.

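Proof. Let $\lambda_1 \ge \cdots \ge \lambda_k$ be the nonzero eigenvalues of $A$, and let $\lambda'_1 \ge \cdots \ge \lambda'_{k+1}$ be the $k+1$
largest eigenvalues of $A + ww^T$. As the latter matrix is obtained from $A$ by the addition of a
rank one positive semi-definite matrix, their eigenvalues interlace:

\[ \lambda'_1 \ge \lambda_1 \ge \lambda'_2 \ge \cdots \ge \lambda'_k \ge \lambda_k \ge \lambda'_{k+1}. \]

Consider the quantity

\[ \mathrm{Tr}\left[ (A - b'I)^{-1} \right] = \sum_{i \le k} \frac{1}{\lambda_i - b'} - \sum_{i > k} \frac{1}{b'}, \]

where we have written the positive and negative terms in the sum separately. By the
Sherman-Morrison formula,

\[ \mathrm{Tr}\left[ (A + ww^T - b'I)^{-1} \right] - \mathrm{Tr}\left[ (A - b'I)^{-1} \right] = -\frac{w^T (A - b'I)^{-2} w}{1 + w^T (A - b'I)^{-1} w}. \tag{3} \]

Since $w^T (A - b'I)^{-1} w < -1$, the denominator in the right-hand term is negative. The numerator
is positive since $A - b'I$ is non-singular and $(A - b'I)^{-2} \succeq 0$. So, the right-hand side of (3) is
positive.

On the other hand, a direct evaluation of this difference yields

\begin{align*}
0 &< \mathrm{Tr}\left[ (A + ww^T - b'I)^{-1} \right] - \mathrm{Tr}\left[ (A - b'I)^{-1} \right] \\
  &= \sum_{i \le k+1} \frac{1}{\lambda'_i - b'} - \sum_{i > k+1} \frac{1}{b'} - \sum_{i \le k} \frac{1}{\lambda_i - b'} + \sum_{i > k} \frac{1}{b'} \\
  &= \frac{1}{\lambda'_{k+1} - b'} + \frac{1}{b'} + \sum_{i \le k} \left( \frac{1}{\lambda'_i - b'} - \frac{1}{\lambda_i - b'} \right) \\
  &\le \frac{1}{\lambda'_{k+1} - b'} + \frac{1}{b'} \quad \text{since } \frac{1}{\lambda'_i - b'} - \frac{1}{\lambda_i - b'} \le 0 \text{ for all } i \le k \text{ by interlacing.}
\end{align*}

As $\lambda'_{k+1} \ge 0$, this is only possible if $\lambda'_{k+1} > b'$, as desired. □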
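Lemma 3 is easy to sanity-check on a small instance. The sketch below assumes NumPy; the matrix `A`, the barrier `b_prime`, and the vector `w` are made-up illustrative values, chosen so that condition (2) holds.

```python
import numpy as np

A = np.diag([2.0, 0.0, 0.0])      # k = 1 nonzero eigenvalue, and 2 > b'
b_prime = 1.0
w = np.array([1.0, 2.0, 0.0])     # illustrative candidate vector

M = np.linalg.inv(A - b_prime * np.eye(3))
lhs = w @ M @ w                   # left-hand side of condition (2)

eigs = np.linalg.eigvalsh(A + np.outer(w, w))   # ascending order
nonzero = eigs[np.abs(eigs) > 1e-12]            # nonzero eigenvalues of A + w w^T

print(lhs < -1, len(nonzero), nonzero.min() > b_prime)
```

In this instance the update adds exactly one new nonzero eigenvalue, and both nonzero eigenvalues of $A + ww^T$ lie above the barrier, as the lemma predicts.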
The updated potential after one step, as the barrier moves from $b$ to $b' = b - \delta$, can be
calculated using the Sherman-Morrison formula:

\begin{align*}
\Phi_{b'}(A + ww^T) &= \mathrm{Tr}\left[ L^T (A - b'I + ww^T)^{-1} L \right] \\
  &= \mathrm{Tr}\left[ L^T (A - b'I)^{-1} L \right] - \frac{\mathrm{Tr}\left[ L^T (A - b'I)^{-1} ww^T (A - b'I)^{-1} L \right]}{1 + w^T (A - b'I)^{-1} w} \\
  &= \Phi_{b'}(A) - \frac{w^T (A - b'I)^{-1} LL^T (A - b'I)^{-1} w}{1 + w^T (A - b'I)^{-1} w}.
\end{align*}

To prevent an increase in potential, we want to choose a $w$ such that

\[ \Phi_{b'}(A) - \frac{w^T (A - b'I)^{-1} LL^T (A - b'I)^{-1} w}{1 + w^T (A - b'I)^{-1} w} \le \Phi_b(A). \tag{4} \]
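The rank-one update of the potential can be compared against a direct computation. This is a minimal sketch assuming NumPy; the matrices `A`, `L` and the vector `w` are illustrative values only.

```python
import numpy as np

A = np.diag([2.0, 0.0, 0.0])
b_prime = 1.0
w = np.array([1.0, 2.0, 0.0])
L = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])   # arbitrary illustrative operator
I = np.eye(3)

M = np.linalg.inv(A - b_prime * I)
phi_old = np.trace(L.T @ M @ L)   # potential at barrier b' before the update

# Direct evaluation of the potential of A + w w^T at barrier b'.
direct = np.trace(L.T @ np.linalg.inv(A + np.outer(w, w) - b_prime * I) @ L)

# Same quantity via the Sherman-Morrison expression.
numerator = w @ M @ L @ L.T @ M @ w
denominator = 1.0 + w @ M @ w
via_formula = phi_old - numerator / denominator
```

The formula only needs the single inverse $(A - b'I)^{-1}$, so every candidate $w$ can be scored without inverting a new matrix.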

We can now determine how small we need the potential to be in order to guarantee that a 
suitable w, which will allow us to keep on going, always exists. 

Lemma 4. Suppose $A$ has $k$ nonzero eigenvalues, all of which are greater than $b$, and let $Q$ be
the orthogonal projection onto the kernel of $A$. If

\[ \Phi_b(A) \le -m - \frac{\|L\|_2^2}{\delta} \tag{5} \]

and

\[ 0 < \delta < b \le \delta \frac{\|QL\|_F^2}{\|L\|_2^2} \tag{6} \]

then there exists a vector $w \in \{Lv_i\}_{i \le m}$ for which $A + ww^T$ has $k+1$ nonzero eigenvalues greater
than $b' = b - \delta$ and $\Phi_{b'}(A + ww^T) \le \Phi_b(A)$.

Proof.$^2$ The vectors satisfying both of the inequalities (2) and (4) are precisely those $w$ for which

\[ w^T (A - b'I)^{-1} LL^T (A - b'I)^{-1} w \le \left( \Phi_b(A) - \Phi_{b'}(A) \right) \cdot \left( -1 - w^T (A - b'I)^{-1} w \right). \]

We can show that such a $w$ exists by taking the sum over all $w \in \{Lv_i\}_{i \le m}$ and ensuring that
the inequality holds in the sum, i.e., that

\[ \mathrm{Tr}\left[ L^T (A - b'I)^{-1} LL^T (A - b'I)^{-1} L \right] \le \left( \Phi_b(A) - \Phi_{b'}(A) \right) \cdot \left( -m - \mathrm{Tr}\left[ L^T (A - b'I)^{-1} L \right] \right). \tag{7} \]

Let $\Delta_b := \Phi_b(A) - \Phi_{b'}(A)$. From the assumption $\Phi_b(A) \le -m - \|L\|_2^2/\delta$ we immediately have

\[ \mathrm{Tr}\left[ L^T (A - b'I)^{-1} L \right] = \Phi_b(A) - \Delta_b \le -m - \frac{\|L\|_2^2}{\delta} - \Delta_b, \]

and so (7) will follow from

\[ \mathrm{Tr}\left[ L^T (A - b'I)^{-1} LL^T (A - b'I)^{-1} L \right] \le \Delta_b \cdot \left( \frac{\|L\|_2^2}{\delta} + \Delta_b \right). \tag{8} \]

2 We would like to thank Pete Casazza for pointing out an important mistake in an earlier version of this proof.

Noting that $LL^T \preceq \|L\|_2^2 I$, we can bound the left hand side as

\[ \mathrm{Tr}\left[ L^T (A - b'I)^{-1} LL^T (A - b'I)^{-1} L \right] \le \|L\|_2^2 \, \mathrm{Tr}\left[ L^T (A - b'I)^{-2} L \right]. \tag{9} \]

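Let $P$ be the projection onto the image of $A$ and let $Q$ be the projection onto its kernel, so that
$P + Q = I$. Let $\Phi^P_b(A) = \mathrm{Tr}\left[ L^T P (A - bI)^{-1} P L \right]$ and $\Phi^Q_b(A) = \mathrm{Tr}\left[ L^T Q (A - bI)^{-1} Q L \right]$ be
the potentials computed on these subspaces, and write $\Delta^P_b := \Phi^P_b(A) - \Phi^P_{b'}(A)$ and
$\Delta^Q_b := \Phi^Q_b(A) - \Phi^Q_{b'}(A)$ for the corresponding differences. Since $P$, $Q$, $A$, $(A - b'I)^{-1}$, and $(A - b'I)^{-2}$ are
mutually diagonalizable, we can write

\[ \Phi_b(A) = \Phi^P_b(A) + \Phi^Q_b(A), \quad \Delta_b = \Delta^P_b + \Delta^Q_b, \quad \text{and} \]

\[ \mathrm{Tr}\left[ L^T (A - b'I)^{-2} L \right] = \mathrm{Tr}\left[ L^T P (A - b'I)^{-2} P L \right] + \mathrm{Tr}\left[ L^T Q (A - b'I)^{-2} Q L \right]. \]

As $P (A - b'I)^{-1} P \succeq 0$ and $P (A - bI)^{-1} P \succeq 0$, it is easy to check that

\[ (b - b') P (A - b'I)^{-2} P \preceq P (A - bI)^{-1} P - P (A - b'I)^{-1} P, \]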

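which immediately gives

\[ \|L\|_2^2 \, \mathrm{Tr}\left[ L^T P (A - b'I)^{-2} P L \right] \le \frac{\|L\|_2^2}{\delta} \Delta^P_b. \tag{10} \]

Thus, by (8), (9), and (10), we are done if we can show that

\[ \|L\|_2^2 \, \mathrm{Tr}\left[ L^T Q (A - b'I)^{-2} Q L \right] \le \left( \Delta^P_b + \Delta^Q_b \right) \cdot \left( \frac{\|L\|_2^2}{\delta} + \Delta_b \right) - \Delta^P_b \frac{\|L\|_2^2}{\delta}. \]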



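Taking into account that $\Delta^P_b, \Delta^Q_b \ge 0$, this is implied by the statement

\[ \|L\|_2^2 \, \mathrm{Tr}\left[ L^T Q (A - b'I)^{-2} Q L \right] \le \Delta^Q_b \cdot \left( \frac{\|L\|_2^2}{\delta} + \Delta^Q_b \right). \tag{11} \]

We now compute $\mathrm{Tr}\left[ L^T Q (A - b'I)^{-2} Q L \right] = \|QL\|_F^2 / b'^2$ and

\[ \Delta^Q_b = \mathrm{Tr}\left[ L^T Q \left( (A - bI)^{-1} - (A - b'I)^{-1} \right) Q L \right] = \frac{\delta \|QL\|_F^2}{b b'}, \]

which upon substituting and rearranging reduces (11) to

\[ \|L\|_2^2 \le \frac{\delta \|QL\|_F^2}{b}, \]

which we have assumed in (6). □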
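For completeness, the rearrangement in the last step can be written out. This is a routine check that uses $\mathrm{Tr}[L^T Q (A - b'I)^{-2} Q L] = \|QL\|_F^2 / b'^2$, $\Delta^Q_b = \delta \|QL\|_F^2 / (b b')$ and $b - b' = \delta$; note that $\|QL\|_F^2 > 0$ is forced by the right-hand inequality in (6), so dividing by it is legitimate.

```latex
\begin{align*}
\|L\|_2^2 \frac{\|QL\|_F^2}{b'^2}
  &\le \frac{\delta \|QL\|_F^2}{b b'} \left( \frac{\|L\|_2^2}{\delta} + \frac{\delta \|QL\|_F^2}{b b'} \right) \\
\iff \quad \frac{\|L\|_2^2}{b'}
  &\le \frac{\|L\|_2^2}{b} + \frac{\delta^2 \|QL\|_F^2}{b^2 b'}
  && \text{(divide by } \|QL\|_F^2 / b' \text{)} \\
\iff \quad \|L\|_2^2 \, (b - b')
  &\le \frac{\delta^2 \|QL\|_F^2}{b}
  && \text{(multiply by } b b' \text{)} \\
\iff \quad \|L\|_2^2
  &\le \frac{\delta \|QL\|_F^2}{b}
  && \text{(divide by } \delta = b - b' \text{)}.
\end{align*}
```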
Proof of Theorem 2. We set

\[ b_0 = \frac{(1-\epsilon) \|L\|_F^2}{m} \quad \text{and} \quad \delta = \frac{(1-\epsilon) \|L\|_2^2}{\epsilon m}. \]

Requirement (5) of Lemma 4 is satisfied at the beginning of the process as

\[ \Phi_{b_0}(0) = -\frac{\|L\|_F^2}{b_0} = -\frac{m}{1-\epsilon} = -m - \frac{\|L\|_2^2}{\delta}. \]

To verify that requirement (6) is satisfied initially, first note that the theorem is vacuously true
if $\epsilon^2 \|L\|_F^2 / \|L\|_2^2 < 1$. Assuming the converse and recalling that $\epsilon < 1$, we may show
$\|L\|_F^2 / \|L\|_2^2 > 1/\epsilon$, which implies that $\delta < b_0$. The inequality

\[ b_0 \le \delta \frac{\|QL\|_F^2}{\|L\|_2^2} \]

is initially true as $A = 0$ and so $Q = \mathrm{Proj}_{\ker(A)} = I$.

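As long as condition (6) is satisfied, we may apply Lemma 4 to add a vector to $\sigma$ while
maintaining $\Phi_b(A) \le \Phi_{b_0}(0)$. The left-hand inequality in (6) will be satisfied after the first $t - 1$
steps if

\[ \delta < b = b_0 - (t-1)\delta \iff t\delta < b_0. \]

This inequality is satisfied for all $t \le \epsilon^2 \|L\|_F^2 / \|L\|_2^2$ as

\[ \epsilon^2 \frac{\|L\|_F^2}{\|L\|_2^2} \, \delta = \frac{\epsilon (1-\epsilon) \|L\|_F^2}{m} < b_0. \]

The right-hand inequality in (6) will always be satisfied if it is satisfied initially, as the squared
Frobenius norm $\|QL\|_F^2$ decreases by at most $\|L\|_2^2$ in each step, so the right-hand side of (6)
drops by at most $\delta$ per step while $b$ drops by exactly $\delta$. Taking $t = \lfloor \epsilon^2 \|L\|_F^2 / \|L\|_2^2 \rfloor$ steps leaves the
barrier at

\[ b_0 - \delta t \ge \frac{(1-\epsilon) \|L\|_F^2}{m} - \epsilon^2 \frac{\|L\|_F^2}{\|L\|_2^2} \cdot \frac{(1-\epsilon) \|L\|_2^2}{\epsilon m} = \frac{(1-\epsilon)^2 \|L\|_F^2}{m}, \]

which is the promised bound. □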
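The proof translates into a greedy selection procedure. The following is a rough sketch of that procedure assuming NumPy, not a tuned implementation: the function name `restricted_invertibility`, the tolerance `tol`, and the rule of picking the admissible vector with the smallest potential increase are illustrative assumptions (the proof only asserts that some admissible vector exists at each step).

```python
import numpy as np

def restricted_invertibility(L, V, eps, tol=1e-9):
    """Greedy barrier selection (illustrative sketch; names are assumptions).

    L   : (n, n) array, the operator.
    V   : (n, m) array whose columns v_i satisfy sum_i v_i v_i^T = I.
    eps : parameter in (0, 1).
    Returns the list sigma of selected column indices.
    """
    n, m = V.shape
    I = np.eye(n)
    W = L @ V                                   # columns w_i = L v_i
    fro_sq = np.sum(L ** 2)                     # ||L||_F^2
    op_sq = np.linalg.norm(L, 2) ** 2           # ||L||_2^2
    b = (1 - eps) * fro_sq / m                  # initial barrier b_0
    delta = (1 - eps) * op_sq / (eps * m)       # barrier shift per step
    # Number of steps; tol guards against round-off just below an integer.
    steps = int(np.floor(eps ** 2 * fro_sq / op_sq + tol))

    A = np.zeros((n, n))
    sigma = []
    for _ in range(steps):
        b_new = b - delta
        M = np.linalg.inv(A - b_new * I)
        phi_b = np.trace(L.T @ np.linalg.solve(A - b * I, L))
        phi_new = np.trace(L.T @ M @ L)
        MW = M @ W
        D = np.sum(W * MW, axis=0)              # w^T M w for every candidate
        N = np.sum((L.T @ MW) ** 2, axis=0)     # w^T M L L^T M w
        valid = D < -1                          # condition (2)
        cost = np.full(m, np.inf)
        cost[valid] = N[valid] / (-1 - D[valid])  # potential increase over phi_new
        j = int(np.argmin(cost))
        if not cost[j] <= phi_b - phi_new + tol:  # condition (4)
            raise RuntimeError("no admissible vector found (numerical issue?)")
        A += np.outer(W[:, j], W[:, j])
        sigma.append(j)
        b = b_new
    return sigma

# Usage on a trivial instance: L = I with the standard basis as the v_i.
sigma = restricted_invertibility(np.eye(8), np.eye(8), eps=0.5)
print(sigma)
```

Each step reuses a single inverse $(A - b'I)^{-1}$ to score all $m$ candidates at once, following the Sherman-Morrison expression for the updated potential. The sketch recomputes that inverse from scratch in every step, so no claim is made that it attains the running time stated in the introduction.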